Measuring Similarity between Graphs Based on the Levenshtein Distance

Authors

  • Bin Cao
  • Ying Li
  • Jianwei Yin
Abstract

Graph data is widely used and studied in both academia and industry. Measuring the similarity between graphs (i.e., graph matching) is an essential step in graph searching, pattern recognition, and machine vision. At present, the most widely used approach to the graph matching problem is graph edit distance (GED). However, GED is computationally expensive, and its running time becomes unacceptable as graphs grow larger. In general, a graph can be canonically labeled by a string, and we use the depth-first search (DFS) code as our canonical labeling system. By combining DFS codes with the Levenshtein distance (i.e., string edit distance, SED), we propose a novel method for measuring the similarity of graphs. By processing the DFS codes of two graphs and calculating the distance between them, we turn the graph matching problem into a string matching problem, which greatly improves matching performance. The experimental results demonstrate the usefulness of the method.
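The core idea of comparing two DFS codes with the Levenshtein distance can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the DFS codes are processed before comparison, so the sketch assumes each code is already available as a sequence of edge tuples `(from_index, to_index, from_label, edge_label, to_label)`, treats every tuple as a single symbol, and uses unit costs for insertion, deletion, and substitution. The example codes and the normalized `similarity` score are hypothetical.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Unit-cost edit distance between two sequences (two-row dynamic programming)."""
    # prev[j] holds the distance between the current prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y, or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical DFS codes (assumption): one tuple per edge,
# (from_index, to_index, from_label, edge_label, to_label).
code_g1 = [(0, 1, "A", "x", "B"), (1, 2, "B", "y", "C"), (2, 0, "C", "z", "A")]
code_g2 = [(0, 1, "A", "x", "B"), (1, 2, "B", "y", "D")]

distance = levenshtein(code_g1, code_g2)

# Assumption: normalize by the longer code to get a score in [0, 1].
similarity = 1 - distance / max(len(code_g1), len(code_g2))
```

Because the distance is computed over two sequences, its cost is proportional to the product of the two code lengths, which is the source of the speed-up over GED that the abstract describes. Producing the canonical (minimum) DFS code for each graph is a separate step that this sketch does not cover.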


Similar Resources

Measuring of Strategies' Similarity in Automated Negotiation

Negotiation is a process between self-interested agents in e-commerce trying to reach an agreement on one or more issues. The outcome of the negotiation depends on several parameters such as the agents' strategies and the knowledge one agent has about the opponents. One way of discovering an opponent's strategy is to find the similarity between strategies. In this paper we present a simple model...


A Knowledge-Rich Approach to Measuring the Similarity between Bulgarian and Russian Words

We propose a novel knowledge-rich approach to measuring the similarity between a pair of words. The algorithm is tailored to Bulgarian and Russian and takes into account the orthographic and the phonetic correspondences between the two Slavic languages: it combines lemmatization, hand-crafted transformation rules, and weighted Levenshtein distance. The experimental results show an 11-pt interpo...


Random Projection and Geometrization of String Distance Metrics

Edit distance is not the only way to calculate the distance between two character sequences. Strings can also be compared in somewhat subtler geometric ways. A procedure inspired by Random Indexing can attribute a D-dimensional geometric coordinate to any character N-gram present in the corpus and can subsequently represent the word as a sum of N-gram fragments which the string conta...


Measuring the Similarity of Trajectories Using Fuzzy Theory

In recent years, advances in positioning systems have made large amounts of movement data available. One way to discover knowledge from this type of data is to measure the similarity of the trajectories produced by moving objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently an...


Measuring Musical Rhythm Similarity: Edit Distance versus Minimum-Weight Many-to-Many Matchings

Musical rhythms are represented as binary symbol sequences of sounded and silent pulses of unit-duration. A measure of distance (dissimilarity) between a pair of rhythms commonly used in music information retrieval, music perception, and musicology is the edit (Levenshtein) distance, defined as the minimum number of symbol insertions, deletions, and substitutions needed to transform one rhythm ...




Publication date: 2013